While the problem of hallucinations in neural machine translation has long been recognized, little progress has been made on alleviating it. Indeed, it recently turned out that without artificially encouraging models to hallucinate, previously existing methods fall short and even the standard sequence log-probability is more informative. This means that characteristics internal to the model can give much more information than we expect, and before using external models and measures, we first need to ask: how far can we go if we use nothing but the translation model itself? We propose to use a method that evaluates the percentage of the source contribution to a generated translation. Intuitively, hallucinations are translations "detached" from the source, hence they can be identified by low source contribution. This method improves detection accuracy for the most severe hallucinations by a factor of 2 and is able to alleviate hallucinations at test time on par with the previous best approach that relies on external models. Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results.
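The idea of flagging translations with low source contribution can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that some attribution method has already produced, for every generated token, a pair of non-negative scores (contribution of the source, contribution of the target prefix). The function names, the input format, and the threshold value of 0.5 are all hypothetical.

```python
from typing import List, Tuple


def source_contribution(token_contribs: List[Tuple[float, float]]) -> float:
    """Average share of the source in the total contribution per generated token.

    Each pair is (source_score, target_prefix_score); this format is an
    assumption made for the sketch, not something defined in the abstract.
    """
    if not token_contribs:
        raise ValueError("need at least one generated token")
    shares = []
    for src, tgt in token_contribs:
        total = src + tgt
        # A token with no attributed contribution is counted as zero source share.
        shares.append(src / total if total > 0 else 0.0)
    return sum(shares) / len(shares)


def is_likely_hallucination(
    token_contribs: List[Tuple[float, float]], threshold: float = 0.5
) -> bool:
    """Flag a translation whose mean source share falls below a threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return source_contribution(token_contribs) < threshold


# Usage with made-up scores for two translations.
grounded = [(0.8, 0.2), (0.7, 0.3)]
detached = [(0.1, 0.9), (0.2, 0.8)]
print(is_likely_hallucination(grounded), is_likely_hallucination(detached))
```

Averaging per-token shares keeps the score comparable across translations of different lengths, which is one plausible design choice among several.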
Translated by Google Translate
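We introduce the first neural machine translation system for translation between the endangered Erzya language and Russian, along with the dataset we collected to train and evaluate it. The BLEU scores are 17 and 19 for translation into Erzya and Russian respectively, and more than half of the translations are rated as acceptable by native speakers. We also adapt the model to translate between Erzya and 10 other languages, but without additional parallel data the quality in these directions remains low. We release the translation models together with the collected text corpus, a new language identification model, and a multilingual sentence encoder adapted for the Erzya language. These resources will be available at https://github.com/slone-nlp/myv-nmt.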
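Text style transfer techniques are gaining popularity in natural language processing, finding various applications such as text detoxification, sentiment transfer, or formality transfer. However, most existing approaches have been tested on domains such as online communication on public platforms, music, or entertainment, and none of them have been applied to the domains typical of task-oriented production systems, such as personal plan arrangements (e.g., booking a flight or reserving a table in a restaurant). We fill this gap by studying formality transfer in this domain. We note that texts in this domain are full of named entities, which are very important for preserving the original meaning of the text. Indeed, if, for example, someone communicates the destination city of a flight, it must not be altered. We therefore concentrate on the role of named entities in content preservation for formality text style transfer. We collect a new dataset for evaluating content similarity measures in text style transfer. It is taken from a corpus of task-oriented dialogues and contains many important entities related to realistic requests, which makes this dataset particularly useful for testing style transfer models before using them in production. In addition, we perform an error analysis of a pre-trained formality transfer model and introduce a simple technique that uses information about named entities to improve the performance of baseline content similarity measures used in text style transfer.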
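Recent progress in medical artificial intelligence (AI) has delivered systems that can reach clinical expert-level performance. However, such systems tend to demonstrate sub-optimal "out-of-distribution" performance when evaluated in clinical settings different from the training environment. A common mitigation strategy is to develop separate systems for each clinical setting using site-specific data [1]. However, this quickly becomes impractical, as medical data is time-consuming to acquire and expensive to annotate [2]. Thus, the problem of "data-efficient generalization" presents an ongoing difficulty for medical AI development. Although progress in representation learning shows promise, its benefits have not been rigorously studied, specifically for out-of-distribution settings. To meet these challenges, we present REMEDIS, a unified representation learning strategy to improve the robustness and data efficiency of medical imaging AI. REMEDIS uses a generic combination of large-scale supervised transfer learning and self-supervised learning, and requires little task-specific customization. We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data. REMEDIS exhibits significantly improved in-distribution performance, with up to 11.5% relative improvement in diagnostic accuracy over a strong supervised baseline. More importantly, our strategy leads to strong data-efficient generalization of medical imaging AI, matching strong supervised baselines using between 1% and 33% of the retraining data across tasks. These results suggest that REMEDIS can significantly accelerate the life cycle of medical imaging AI development, presenting an important step forward for medical imaging AI to deliver broad impact.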
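Histopathological cancer diagnostics has become more complex, and the increasing number of biopsies is a challenge for most pathology laboratories. The development of automatic methods for evaluating histopathological cancer sections would therefore be of value. In this study, we used 624 whole slide images (WSIs) of breast cancer from a Norwegian cohort. We propose a cascaded convolutional neural network design, called H2G-Net, for semantic segmentation of gigapixel histopathological images. The design involves a detection stage using a patch-wise method and a refinement stage using a convolutional autoencoder. To validate the design, we conducted an ablation study to assess the impact of selected components in the pipeline on tumour segmentation. Guiding segmentation, using hierarchical sampling and deep heatmap refinement, proved to be beneficial when segmenting the histopathological images. We found a significant improvement when using a refinement network to post-process the generated tumour segmentation heatmaps. The overall best design achieved a Dice score of 0.933 on an independent test set of 90 WSIs. The design outperformed single-resolution approaches, such as cluster-guided, patch-wise high-resolution classification using MobileNetV2 (0.872) and a low-resolution U-Net (0.874). In addition, segmentation of a representative x400 WSI took ~58 seconds using only the CPU. The findings demonstrate the potential of utilizing a refinement network to improve patch-wise predictions. The solution is efficient and does not require overlapping patch inference or ensembling. Furthermore, we show that deep neural networks can be trained using a random sampling scheme that balances on multiple different labels simultaneously, without the need to store patches on disk. Future work should involve more efficient patch generation and sampling, as well as improved clustering.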
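For readers unfamiliar with the Dice score reported above, the standard definition for two binary masks is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened 0/1 masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions of this example, not details taken from the paper.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Usage on a toy 4-pixel example.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```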
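We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to preserve the text content and remove toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms. We make the method more flexible by enabling BERT to replace mask tokens with a variable number of words. Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal. We compare our models with a number of methods for style transfer. The models are evaluated in a reference-free way using a combination of unsupervised style transfer metrics. Both proposed methods yield new SOTA results.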
In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantics via program invariants, while it also captures program syntax via language semantics learned from a large code corpus using a pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that an APR-generated patch overfits if it (1) violates correct specifications or (2) maintains error behaviors of the original buggy program. If the approach fails to identify an overfitting patch based on invariants, INVALIDATOR uses a model trained on labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR is able to leverage both semantic and syntactic reasoning to enhance its discriminative capability. Second, INVALIDATOR does not require new test cases to be generated; instead, it relies only on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We conducted our experiments on a dataset of 885 patches generated for real-world programs in Defects4J. Experimental results show that INVALIDATOR correctly classified 79% of overfitting patches, detecting 23% more overfitting patches than the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
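The two-rule check with a learned fallback described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than INVALIDATOR's actual implementation: invariants are modeled as plain strings, "violating a correct specification" is approximated as the patched program no longer exhibiting that invariant, and `fallback_classifier` is a hypothetical stand-in for the model trained on labeled patches.

```python
from typing import Callable, Set


def is_overfitting(
    patch_invariants: Set[str],
    correct_specs: Set[str],
    error_behaviors: Set[str],
    fallback_classifier: Callable[[], bool],
) -> bool:
    """Decide whether a patch overfits, following the two rules in the abstract.

    All argument names and the set-of-strings representation are assumptions
    made for this sketch.
    """
    # Rule 1: some correct specification no longer holds on the patched program.
    violates_correct_spec = not correct_specs <= patch_invariants
    # Rule 2: the patched program still exhibits an error behavior of the buggy one.
    keeps_error_behavior = bool(error_behaviors & patch_invariants)
    if violates_correct_spec or keeps_error_behavior:
        return True
    # Invariants were inconclusive: defer to the (hypothetical) learned classifier.
    return fallback_classifier()


# Usage with toy invariants; the classifier stub always answers "not overfitting".
specs = {"x >= 0", "result != null"}
errors = {"index == size"}
print(is_overfitting({"x >= 0"}, specs, errors, lambda: False))
```

Passing the fallback as a callable keeps the invariant-based rules independent of any particular model, so the learned component is only consulted when both rules are inconclusive.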
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.